Search results for "Association rule learning"
showing 10 items of 14 documents
Exceptional Pattern Discovery
2017
This chapter is devoted to a discussion of exceptional pattern discovery, namely the scenarios, contexts, and techniques concerning the mining of patterns that are so rare or so frequent as to be considered exceptional and, therefore, of interest to an expert seeking to shed light on the domain. Frequent patterns have found broad applications in areas like association rule mining, indexing, and clustering [1, 20, 23]. The application of frequent patterns in classification has also achieved some success in the classification of relational data [6, 13, 14, 19, 25], text [15], and graphs [7]. The part is organized as follows. First, frequent pattern mining on classical datasets is presented. This is not …
3D Matrix-Based Visualization System of Association Rules
2017
With the growing number of mining datasets, it becomes increasingly difficult to explore interesting rules because of the large number of resulting rules and their inherent complexity. Studies on human perception and intuition show that graphical representation could be a better illustration of how to seek information from the data using the capabilities of the human visual system. In this work, we present and implement a 3D matrix-based visualization system for association rules. The main visual representation applies the extended matrix-based approach with rule-to-items mapping to general transaction data sets. A novel method for merging rules and assigning weights is proposed in order to reduce the …
Discovering representative models in large time series databases
2004
The discovery of frequently occurring patterns in a time series could be important in several application contexts. As an example, the analysis of frequent patterns in biomedical observations could support diagnosis and/or prognosis. Moreover, the efficient discovery of frequent patterns may play an important role in several data mining tasks such as association rule discovery, clustering, and classification. However, in order to identify interesting repetitions, it is necessary to allow errors in the matching patterns; in this context, it is difficult to select one pattern particularly suited to represent the set of similar ones, whereas modelling this set with a single model could…
Predicting hospital associated disability from imbalanced data using supervised learning
2019
Hospitalization of elderly patients can lead to serious adverse effects on their functional capability. Identifying the underlying factors leading to such adverse effects is an active area of medical research. The purpose of the current paper is to show the potential of artificial intelligence in the form of machine learning to complement the existing medical research. This is accomplished by studying the outcome of hospitalization of elderly patients as a supervised learning task. A rich set of features characterizing the medical and social situation of elderly patients is leveraged and using confusion matrices, association rule mining, and two different classes of supervised learning algo…
Business models to offer customized output in electronic commerce
2003
Seller-driven business models (e.g. online bookstores) have been successfully implemented and concretized in Electronic Commerce, both in practice and in science, in recent years. In contrast, we observe that more customer-driven business models are only beginning to be implemented. One major problem of customizable products and services in Electronic Commerce lies in adapting the human advisory activity that is indispensable in traditional sales. For this reason we describe customization from the customer's view and the corresponding business models in electronic markets. The main focus will be on the improvement of the communication interface between customer and seller i…
Does relevance matter to data mining research?
2008
Data mining (DM) and knowledge discovery are intelligent tools that help to accumulate and process data and make use of it. We review several existing frameworks for DM research that originate from different paradigms. These DM frameworks mainly address various DM algorithms for the different steps of the DM process. Recent research has shown that many real-world problems require the integration of several DM algorithms from different paradigms in order to produce a better solution, which elevates the importance of practice-oriented aspects in DM research as well. In this chapter we strongly emphasize that DM research should also take into account the relevance of research, not only the rigor of it. Und…
Hints from the Crowd: A Novel NoSQL Database
2013
The crowd can be an incredible source of information. In particular, this is true for reviews about products of any kind, freely provided by customers through specialized web sites. In other words, they are social knowledge that can be exploited by other customers. The Hints From the Crowd (HFC) prototype, presented in this paper, is a NoSQL database system for large collections of product reviews; the database is queried by expressing a natural language sentence; the result is a list of products ranked based on the relevance of reviews w.r.t. the natural language sentence. The best ranked products in the result list can be seen as the best hints for the user based on crowd opinions the revi…
New Similarity Rules for Mining Data
2006
Variability and noise in data-set entries make it hard to discover important regularities among association rules in mining problems. There is a need to define flexible and robust similarity measures between association rules. This paper introduces a new class of similarity functions (SFs) that can be used to discover properties in the feature space X and to perform their grouping with standard clustering techniques. Properties of the proposed SFs are investigated, and experiments on simulated data-sets are also shown to evaluate the grouping performance.
Expert-based versus citation-based ranking of scholarly and scientific publication channels
2016
The Finnish publication channel quality ranking system was established in 2010. The system is expert-based, where separate panels decide and update the rankings of a set of publication channels allocated to them. The aggregated rankings have a notable role in the allocation of public resources to universities. The purpose of this article is to analyze this national ranking system. The analysis is mainly based on two publicly available databases containing the publication source information and the actual national publication activity information. Using citation-based indicators and other available information with association rule mining, decision trees, and confusion matrices, …
A probabilistic condensed representation of data for stream mining
2014
Data mining and machine learning algorithms usually operate directly on the data. However, if the data is not available all at once or consists of billions of instances, these algorithms easily become infeasible with respect to memory and run-time concerns. As a solution to this problem, we propose a framework, called MiDEO (Mining Density Estimates inferred Online), in which algorithms are designed to operate on a condensed representation of the data. In particular, we propose to use density estimates, which are able to represent billions of instances in a compact form and can be updated when new instances arrive. As an example of an algorithm that operates on density estimates, we consider t…